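Can foundation models be guided to perform tasks that involve legal reasoning? We believe that building a benchmark to answer this question will require a sustained collaborative effort between the computer science and legal communities. To that end, this short paper serves three purposes. First, we describe how IRAC, a framework legal scholars use to distinguish different types of legal reasoning, can guide the construction of a benchmark oriented toward foundation models. Second, we present a seed set of 44 tasks built according to this framework. We discuss initial findings and highlight directions for new tasks. Finally, inspired by the Open Science movement, we call on the legal and computer science communities to join our efforts by contributing new tasks. This work is ongoing, and our progress can be tracked here: https://github.com/hazyresearch/legalbench.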
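This paper introduces a new, highly consequential setting for the use of computer vision for environmental sustainability. Concentrated Animal Feeding Operations (CAFOs) (also known as intensive livestock farms or "factory farms") produce large amounts of manure and pollution. Dumping manure in the winter months poses significant environmental risks and violates environmental law in many states. Yet the federal Environmental Protection Agency (EPA) and state agencies have relied primarily on self-reporting to monitor such instances of "land application." Our paper makes four contributions. First, we introduce the environmental, policy, and agricultural setting of CAFOs and land application. Second, we provide a new dataset of high-cadence (daily to weekly) 3m/pixel satellite imagery from 2018-20 for 330 CAFOs in Wisconsin, with hand-labeled instances of land application (n = 57,697). Third, we develop an object detection model to predict land application and a system to perform inference in real time. We show that this system appears to detect land application effectively (PR AUC = 0.93), and we uncover several outlier facilities that appear to apply regularly. Last, we estimate the population prevalence of land application events in Winter 2021/22. We show that the prevalence of land application is much higher than what facilities self-report. The system can be used by environmental regulators and interest groups, one of which piloted field visits based on this system this past winter. Overall, our application demonstrates the potential of AI-based computer vision systems to solve major problems in environmental compliance with near-daily imagery.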
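As a reading aid for the PR AUC figure quoted above, the following is a minimal sketch of average precision, one common estimator of the area under the precision-recall curve. The abstract does not say which estimator the authors used, so this choice is an assumption, and the labels and scores are invented for illustration (they are not from the Wisconsin dataset).

```python
def average_precision(labels, scores):
    """Average precision: mean of precision@k taken at the rank k of each true positive.

    This is one standard estimator of PR AUC; tie handling between equal
    scores is not addressed in this sketch.
    """
    # Rank detections from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precisions.append(hits / rank)  # precision at this recall step
    if not precisions:
        raise ValueError("need at least one positive label")
    return sum(precisions) / len(precisions)


# Hypothetical detector output: 1 = land application present, 0 = absent.
labels = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.1]
ap = average_precision(labels, scores)
print(round(ap, 3))  # 0.833
```

A perfect ranking (all positives scored above all negatives) gives 1.0, which is why a value of 0.93 indicates that most true land application events are ranked ahead of false alarms.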
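This study examines issues of algorithmic fairness in the context of systems that inform tax audit selection by the United States Internal Revenue Service (IRS). While the field of algorithmic fairness has developed primarily around notions of treating like individuals alike, we instead explore the concept of vertical equity (appropriately accounting for relevant differences across individuals), which is a central component of fairness in many public policy settings. Applied to the design of the U.S. individual income tax system, vertical equity relates to the fair allocation of tax and enforcement burdens across taxpayers of different income levels. Through a unique collaboration with the Treasury Department and the IRS, we use anonymized individual taxpayer microdata, risk-selected audits, and random audits from 2010-14 to study vertical equity in tax administration. In particular, we assess how the use of modern machine learning methods for selecting audits may affect vertical equity. First, we show how the use of more flexible machine learning (classification) methods, as opposed to simpler models, shifts audit burdens from high-income to middle-income taxpayers. Second, we show that while existing algorithmic fairness techniques can mitigate some disparities across income, they can incur a steep cost to performance. Third, we show that the choice of whether to treat the risk of underreporting as a classification or a regression problem is highly consequential. Moving from classification to regression models to predict underreporting shifts the audit burden substantially toward high-income individuals, while increasing revenue. Last, we explore the role of differential audit cost in shaping the audit distribution. We show that a narrow focus on return on investment can undermine vertical equity. Our results have implications for the design of algorithmic tools across the public sector.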
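The classification-versus-regression point above can be illustrated with a toy ranking example. This is only a sketch under stated assumptions: the `Taxpayer` record, the `select_audits` helper, and every number below are invented for illustration and do not come from the IRS data or the authors' models.

```python
from dataclasses import dataclass


@dataclass
class Taxpayer:
    income_group: str        # "low", "middle", or "high"
    p_underreport: float     # hypothetical classifier score: P(underreporting)
    expected_amount: float   # hypothetical regression output: dollars underreported


def select_audits(taxpayers, k, key):
    """Audit the k taxpayers ranked highest by the given scoring function."""
    return sorted(taxpayers, key=key, reverse=True)[:k]


# Invented population: middle-income returns are likely to contain small
# errors, while high-income returns are less likely to but involve larger sums.
population = [
    Taxpayer("middle", 0.90, 500.0),
    Taxpayer("middle", 0.85, 300.0),
    Taxpayer("middle", 0.80, 800.0),
    Taxpayer("high", 0.50, 20_000.0),
    Taxpayer("high", 0.40, 50_000.0),
    Taxpayer("low", 0.30, 200.0),
]

by_classification = select_audits(population, 2, key=lambda t: t.p_underreport)
by_regression = select_audits(population, 2, key=lambda t: t.expected_amount)

print([t.income_group for t in by_classification])  # ['middle', 'middle']
print([t.income_group for t in by_regression])      # ['high', 'high']
```

With the same audit budget, ranking by probability concentrates audits on the middle-income group, while ranking by predicted amount moves them to the high-income group, which mirrors the direction of the shift the abstract reports.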
We introduce a new setting, optimize-and-estimate structured bandits. Here, a policy must select a batch of arms, each characterized by its own context, that would allow it to both maximize reward and maintain an accurate (ideally unbiased) population estimate of the reward. This setting is inherent to many public and private sector applications and often requires handling delayed feedback, small data, and distribution shifts. We demonstrate its importance on real data from the United States Internal Revenue Service (IRS). The IRS performs yearly audits of the tax base. Two of its most important objectives are to identify suspected misreporting and to estimate the "tax gap" -- the global difference between the amount paid and true amount owed. Based on a unique collaboration with the IRS, we cast these two processes as a unified optimize-and-estimate structured bandit. We analyze optimize-and-estimate approaches to the IRS problem and propose a novel mechanism for unbiased population estimation that achieves rewards comparable to baseline approaches. This approach has the potential to improve audit efficacy, while maintaining policy-relevant estimates of the tax gap. This has important social consequences given that the current tax gap is estimated at nearly half a trillion dollars. We suggest that this problem setting is fertile ground for further research and we highlight its interesting challenges. The results of this and related research are currently being incorporated into the continual improvement of the IRS audit selection methods.
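To make the optimize-and-estimate tension described above concrete, here is a minimal sketch that spends part of a batch greedily and the rest on a simple random sample, then uses a Horvitz-Thompson estimator for the population total. This is a textbook construction swapped in for illustration, not the mechanism proposed in the paper; `select_batch`, `horvitz_thompson_total`, and the synthetic rewards are all assumptions.

```python
import random


def select_batch(predicted, budget, n_greedy, rng):
    """Pick `budget` arms and return {arm index: inclusion probability}.

    The n_greedy arms with the highest predicted reward are always taken
    (probability 1); the remaining slots are filled by simple random sampling
    without replacement from all other arms.
    """
    order = sorted(range(len(predicted)), key=lambda i: predicted[i], reverse=True)
    greedy, rest = order[:n_greedy], order[n_greedy:]
    n_random = budget - n_greedy
    if not 0 <= n_random <= len(rest):
        raise ValueError("budget must lie between n_greedy and the number of arms")
    probs = {i: 1.0 for i in greedy}
    for i in rng.sample(rest, n_random):
        probs[i] = n_random / len(rest)  # known inclusion probability
    return probs


def horvitz_thompson_total(rewards, probs):
    """Unbiased estimate of the population total from the observed arms only."""
    return sum(rewards[i] / p for i, p in probs.items())


# Synthetic population: true rewards plus a noisy prediction of each.
rng = random.Random(0)
true_reward = [rng.expovariate(1.0) for _ in range(200)]
predicted = [r + rng.gauss(0.0, 0.5) for r in true_reward]

probs = select_batch(predicted, budget=40, n_greedy=20, rng=rng)
collected = sum(true_reward[i] for i in probs)            # reward actually earned
estimate = horvitz_thompson_total(true_reward, probs)     # population estimate
print(f"collected={collected:.1f} estimate={estimate:.1f} truth={sum(true_reward):.1f}")
```

Raising `n_greedy` increases the reward collected but leaves fewer random draws, so the estimate stays unbiased while its variance grows. That trade-off is the one the setting asks a policy to manage.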
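Concentrated animal feeding operations (CAFOs) pose serious risks to air, water, and public health, but have proven challenging to regulate. The U.S. Government Accountability Office notes that a basic challenge is the lack of comprehensive location information on CAFOs. We use the USDA's National Agriculture Imagery Program (NAIP) 1m/pixel aerial imagery to detect poultry CAFOs across the continental United States. We train convolutional neural network (CNN) models to identify individual poultry barns and apply the best-performing model to over 42 TB of imagery to create the first national, open-source dataset of poultry CAFOs. We validate the model predictions against poultry CAFO facilities in 10 hand-labeled counties in California and demonstrate that this approach has significant potential to fill gaps in environmental monitoring.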
Previous work has shown the potential of deep learning to predict renal obstruction using kidney ultrasound images. However, these image-based classifiers have been trained with the goal of single-visit inference in mind. We compare methods from video action recognition (i.e. convolutional pooling, LSTM, TSM) to adapt single-visit convolutional models to handle multiple visit inference. We demonstrate that incorporating images from a patient's past hospital visits provides only a small benefit for the prediction of obstructive hydronephrosis. Therefore, inclusion of prior ultrasounds is beneficial, but prediction based on the latest ultrasound is sufficient for patient risk stratification.
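Of the multi-visit methods named above, pooling across visits is the simplest to sketch: each visit's image is embedded by the single-visit model, and the per-visit feature vectors are merged by an element-wise maximum before classification. The `pool_visits` helper and the feature values below are hypothetical; the abstract does not specify where in the network the pooling is applied.

```python
def pool_visits(visit_features):
    """Element-wise max over per-visit feature vectors (one vector per visit)."""
    if not visit_features:
        raise ValueError("need at least one visit")
    if len({len(v) for v in visit_features}) != 1:
        raise ValueError("all visits must have the same feature length")
    # zip(*...) walks the vectors feature by feature across visits.
    return [max(column) for column in zip(*visit_features)]


# Hypothetical 3-dimensional features from three past visits of one patient.
visits = [
    [0.1, 0.9, 0.3],
    [0.4, 0.2, 0.3],
    [0.2, 0.5, 0.8],
]
pooled = pool_visits(visits)
print(pooled)  # [0.4, 0.9, 0.8]
```

Max pooling ignores visit order, unlike the LSTM and TSM variants, which model the sequence of visits explicitly.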
Remote state estimation of large-scale distributed dynamic processes plays an important role in Industry 4.0 applications. In this paper, we focus on the transmission scheduling problem of a remote estimation system. First, we derive some structural properties of the optimal sensor scheduling policy over fading channels. Then, building on these theoretical guidelines, we develop a structure-enhanced deep reinforcement learning (DRL) framework for optimal scheduling of the system to achieve the minimum overall estimation mean-square error (MSE). In particular, we propose a structure-enhanced action selection method, which tends to select actions that obey the policy structure. This explores the action space more effectively and enhances the learning efficiency of DRL agents. Furthermore, we introduce a structure-enhanced loss function that penalizes actions that do not follow the policy structure. The new loss function guides the DRL agent to converge to the optimal policy structure quickly. Our numerical experiments illustrate that the proposed structure-enhanced DRL algorithms can reduce training time by 50% and reduce the remote estimation MSE by 10% to 25% when compared to benchmark DRL algorithms. In addition, we show that the derived structural properties exist in a wide range of dynamic scheduling problems that go beyond remote state estimation.
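One way to picture a structure-enhanced loss of the kind described above is a standard temporal-difference term plus a penalty that grows when the greedy action disagrees with the action the policy structure suggests. The exact loss in the paper is not given in the abstract, so the form below, the `structure_enhanced_loss` name, and the `penalty_weight` parameter are all assumptions made for illustration.

```python
def structure_enhanced_loss(q_values, action, td_target, structural_action, penalty_weight=0.5):
    """Squared TD error plus a penalty for violating an assumed policy structure.

    q_values          : Q-value estimates for every action in the current state
    action            : index of the action that was actually taken
    td_target         : bootstrapped target for Q(state, action)
    structural_action : index of the action the derived structure suggests
    """
    td_loss = (td_target - q_values[action]) ** 2
    # Gap between the best Q-value and the structure-suggested action's Q-value.
    # It is zero exactly when the structural action is already greedy.
    violation = max(q_values) - q_values[structural_action]
    return td_loss + penalty_weight * violation


q = [1.0, 2.0]  # hypothetical Q-values for two scheduling actions
print(structure_enhanced_loss(q, action=0, td_target=1.5, structural_action=1))  # 0.25
print(structure_enhanced_loss(q, action=0, td_target=1.5, structural_action=0))  # 0.75
```

In the first call the greedy action already matches the structure, so only the TD term remains. In the second call the mismatch adds a penalty, which would push the Q-values toward the structure during training.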
